Probably Approximately Correct Learning in Stochastic Games with Temporal Logic Specifications

Authors

  • Min Wen
  • Ufuk Topcu
Abstract

We consider a controller synthesis problem in turn-based stochastic games with both a qualitative linear temporal logic (LTL) constraint and a quantitative discounted-sum objective. For each case in which the LTL specification is realizable and can be equivalently transformed into a deterministic Büchi automaton, we show that there always exists a memoryless almost-sure winning strategy that is ε-optimal with respect to the discounted-sum objective for any positive ε. Building on the idea of the R-MAX algorithm, we propose a probably approximately correct (PAC) learning algorithm that can learn such a strategy efficiently in an online manner with a priori unknown reward functions and unknown transition distributions. To the best of our knowledge, this is the first result on PAC learning in stochastic games with independent quantitative and qualitative objectives.
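As a rough illustration of the R-MAX idea the abstract refers to, the sketch below runs value iteration on an optimistic model: any state-action pair observed fewer than m times is assumed to yield the best possible discounted return. It is not the paper's algorithm. It assumes a single controller (no opposing player) and ignores the LTL constraint entirely, and every name in it (`rmax_optimistic_values`, `counts`, `reward_sums`, `m`) is hypothetical.

```python
def rmax_optimistic_values(states, actions, counts, reward_sums,
                           r_max, gamma, m, iters=200):
    """Value iteration on an R-MAX-style optimistic model (illustrative only).

    counts[(s, a)] maps successor states to observed transition counts.
    reward_sums[(s, a)] is the sum of rewards observed for (s, a).
    Pairs seen fewer than m times are "unknown" and get the optimistic
    value r_max / (1 - gamma), which encourages exploring them.
    """
    v_max = r_max / (1.0 - gamma)
    values = {s: 0.0 for s in states}
    for _ in range(iters):
        new_values = {}
        for s in states:
            best = float("-inf")
            for a in actions:
                successors = counts.get((s, a), {})
                n = sum(successors.values())
                if n < m:
                    q = v_max  # unknown pair: assume the best case
                else:
                    # Known pair: use empirical reward and transition estimates.
                    r_hat = reward_sums.get((s, a), 0.0) / n
                    q = r_hat + gamma * sum(
                        c / n * values[s2] for s2, c in successors.items()
                    )
                best = max(best, q)
            new_values[s] = best
        values = new_values
    return values


# Toy usage: state 0 is known (5 samples, zero reward, always moves to 1);
# state 1 has never been tried, so it is valued optimistically.
demo_values = rmax_optimistic_values(
    states=[0, 1], actions=["a"],
    counts={(0, "a"): {1: 5}}, reward_sums={(0, "a"): 0.0},
    r_max=1.0, gamma=0.5, m=5,
)
```

In the toy usage, the unexplored state keeps the optimistic value r_max / (1 - gamma), and the known state inherits a discounted share of it, which is what drives an R-MAX-style learner toward unexplored parts of the model.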

Similar resources

Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints

We consider synthesis of controllers that maximize the probability of satisfying given temporal logic specifications in unknown, stochastic environments. We model the interaction between the system and its environment as a Markov decision process (MDP) with initially unknown transition probabilities. The solution we develop builds on the so-called model-based probably approximately correct Mark...

Symbolic Models for Stochastic Switched Systems: A Discretization and a Discretization-Free Approach

Stochastic switched systems are a relevant class of stochastic hybrid systems with probabilistic evolution over a continuous domain and control-dependent discrete dynamics over a finite set of modes. In the past few years several different techniques have been developed to assist in the stability analysis of stochastic switched systems. However, more complex and challenging objectives related t...

Learning Cooperative Games

This paper explores a PAC (probably approximately correct) learning model in cooperative games. Specifically, we are given m random samples of coalitions and their values, taken from some unknown cooperative game; can we predict the values of unseen coalitions? We study the PAC learnability of several well-known classes of cooperative games, such as network flow games, threshold task games, and...

Approximately Bisimilar Symbolic Models for Stochastic Switched Systems

Stochastic switched systems are a class of continuous-time dynamical models with probabilistic evolution over a continuous domain and control-dependent discrete dynamics over a finite set of locations (modes). As such, they represent a subclass of general stochastic hybrid systems. While the literature has witnessed recent progress in the dynamical analysis and controller synthesis for the stabi...

Safe Control under Uncertainty

Controller synthesis for hybrid systems that satisfy temporal specifications expressing various system properties is a challenging problem that has drawn the attention of many researchers. However, making the assumption that such temporal properties are deterministic is far from the reality. For example, many of the properties the controller has to satisfy are learned through machine learning t...

Publication date: 2016